9 research outputs found

    A neural network architecture for data editing in the Bank of Italy's business surveys

    This paper presents an application of neural network models to predictive classification for data quality control. Our aim is to identify data affected by measurement error in the Bank of Italy's business surveys. We build an architecture consisting of three feed-forward networks for variables related to employment, sales and investment respectively. The networks are trained on input matrices extracted from the error-free final survey database for the 2003 wave and subjected to stochastic transformations reproducing known error patterns. A binary indicator of unit perturbation is used as the output variable. The networks are trained with the Resilient Propagation learning algorithm. On the training and validation sets, correct predictions occur in about 90 per cent of the records for employment, 94 per cent for sales, and 75 per cent for investment. On independent test sets, the respective shares average 92, 80 and 70 per cent. On our data, neural networks perform much better as classifiers than logistic regression, one of the most popular competing methods. They appear to provide a valid means of improving the efficiency of the quality control process and, ultimately, the reliability of survey data.
    Keywords: data quality, data editing, binary classification, neural networks, measurement error
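The construction of the training data described in this abstract (clean records, a stochastic perturbation, and a binary indicator of which units were perturbed) can be illustrated with a minimal sketch. The error pattern used here, a value reported in the wrong unit and therefore multiplied by 1000, is an assumption chosen for illustration, as are the function name `perturb_records`, the sample figures and the error rate; the abstract does not specify any of them, and the sketch does not train a network.

```python
import random

def perturb_records(clean_values, error_rate=0.3, seed=0):
    """Return (values, labels); labels[i] is 1 if record i was perturbed.

    Hypothetical error pattern: the value is reported in the wrong unit,
    so it is multiplied by 1000. The paper's actual patterns are not
    given in the abstract.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    values, labels = [], []
    for v in clean_values:
        if rng.random() < error_rate:
            values.append(v * 1000)  # inject the assumed unit error
            labels.append(1)         # binary indicator: perturbed
        else:
            values.append(v)         # record left error-free
            labels.append(0)         # binary indicator: clean
    return values, labels

# Made-up "error-free" sales figures, for illustration only.
clean_sales = [120.0, 85.5, 430.0, 12.3, 990.0, 55.0]
noisy_sales, is_perturbed = perturb_records(clean_sales, error_rate=0.5, seed=42)
```

In a setup of this kind, `noisy_sales` (together with other covariates) would form the classifier's input and `is_perturbed` its target.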

    Remote processing of firm microdata at the Bank of Italy

    Allowing users to run personalised econometric and statistical analyses on the appropriate data sets by remote processing gives greater flexibility in the production of economic information. Business survey data are subject to binding confidentiality requirements. The Bank of Italy's infrastructure allows its business survey data to be exploited while preserving the anonymity of individual data. The system is based on the LISSY platform, which has already been adopted by the Luxembourg Income Study (LIS) and other research centres. Firms' privacy is safeguarded by forbidding potentially confidentiality-breaking programme statements and by denying the visualisation of individual data. Data confidentiality is protected by removing key identifiers from the database and by trimming data in the right tail of the distribution. The platform provides its services through plain-text e-mails. The authorised user sends an e-mail containing an identifying header followed by a statistical programme to a predetermined address. The system checks the validity of the header, strips out the code and submits it in a batch to one of the econometric/statistical packages available (SAS and Stata). The outputs are mailed back to the user after passing an array of automatic and manual checks.
    Keywords: microdata, confidentiality, remote access
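The request flow described in this abstract (validate the identifying header, strip out the programme, reject confidentiality-breaking statements, then queue the job for SAS or Stata) might be sketched as follows. The header layout (`key=value` lines followed by a blank line), the credential table, the list of forbidden statements and all function names are hypothetical; the abstract does not describe the LISSY message format or its actual checks.

```python
# All names and formats below are assumptions made for illustration.
AUTHORISED_USERS = {"user01": "secret"}         # hypothetical credentials
FORBIDDEN_STATEMENTS = {"list", "browse"}       # hypothetical: commands that display individual records
AVAILABLE_PACKAGES = {"sas", "stata"}           # the two packages named in the abstract

def parse_request(body):
    """Split a plain-text e-mail body into (header_fields, programme)."""
    header_part, _, programme = body.partition("\n\n")
    fields = {}
    for line in header_part.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields, programme

def check_request(body):
    """Return a status string for a submitted request."""
    fields, programme = parse_request(body)
    user = fields.get("user")
    # Header check: the user must exist and the password must match.
    if user not in AUTHORISED_USERS or AUTHORISED_USERS[user] != fields.get("password"):
        return "rejected: invalid header"
    package = fields.get("package", "").lower()
    if package not in AVAILABLE_PACKAGES:
        return "rejected: unknown package"
    # Statement check: refuse lines that start with a forbidden command.
    for line in programme.splitlines():
        words = line.strip().lower().split()
        if words and words[0] in FORBIDDEN_STATEMENTS:
            return "rejected: forbidden statement"
    return "queued: " + package

request = "user=user01\npassword=secret\npackage=stata\n\nregress sales employment\n"
status = check_request(request)
```

A real deployment would also need the manual output checks mentioned in the abstract, which this sketch does not attempt to model.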